Automatic identification of compounds in a sample mixture by means of NMR spectroscopy

ABSTRACT

A process for quantitative and qualitative analysis for identifying compounds in a sample mixture involves the identification of a set of reference spectra selected according to a measured condition, e.g. pH, of the sample, which collectively define a composite spectrum which best matches a spectrum produced from the sample. The compounds associated with respective reference spectra of the identified set are the compounds that are determined to be likely to be present in the sample. Quantities of the compounds may be determined from the intensities of certain representative peaks associated with the compounds, relative to the intensity of a peak associated with a reference compound which is unaffected by the measured condition of the sample. Thus, given a test spectrum of a sample and given a set of reference spectra, the process can identify and quantify compounds present in the sample.

BACKGROUND OF THE INVENTION

[0001] 1. Field of Invention

[0002] This invention relates to qualitative and quantitative chemical analysis, and more particularly to processes, apparatus, media and signals for automatically identifying compounds in a sample.

[0003] 2. Description of Related Art

[0004] The field of biometric identification has grown tremendously over the recent decade both from its relevance to medical diagnostics and to its application as a way to uniquely identify a person or an animal, for example. As diagnostic tools have become more sophisticated, complex liquid mixtures, such as human blood or urine for example, can now be analyzed to identify or search for particular compounds that can provide important diagnostic information to a medical technician or a doctor.

[0005] Generally, the separation and characterization of mixtures is fundamental to nearly every aspect of analytical chemistry and biochemistry. Most approaches to identify and quantify biological compounds in liquid mixtures require an initial compound separation (chromatographic or physical separation) step to separate a particular compound or set of compounds from the mixture. For example, gas chromatography, electrophoresis, and liquid chromatography are used to separate pure chemical components/compounds, for example, from a mixture before analysis is performed. Initial compound separation is required because most spectral identification processes, such as mass spectrometry or infrared, visible, and ultraviolet spectroscopy, require relatively pure samples in order to minimize noise and increase the accuracy of the measuring device. Spectral identification processes are expensive, manually intensive and require a great deal of technical expertise to be performed properly in an accurate, timely manner.

[0006] Nuclear magnetic resonance (NMR) has recently been shown to be an alternative approach to identify and quantify biological compounds without chromatographic separation. In this approach, radio frequency (RF) electromagnetic radiation is applied to a mixture of organic compounds to extract and measure a characteristic RF absorption spectrum of nuclei belonging to each specific organic compound. A large number of compounds are associated with well-defined peaks in the absorption spectrum and knowing which peaks are associated with certain compounds makes it possible to manually identify some of the compounds in the liquid mixture without resorting first to chromatographic separation. However, this process is still quite slow and requires a great deal of a priori information that relates each peak to a given compound. It can take a number of years for experts in NMR spectroscopy to acquire the knowledge required to analyze NMR spectra to accurately identify and quantify compounds in sample mixtures.

[0007] Therefore what is desired is a process and apparatus for quickly, accurately and automatically identifying a number of compounds which may be present in complex liquid mixtures without involving chromatographic separation and without requiring people who are experts in NMR techniques.

SUMMARY OF THE INVENTION

Overall Process

[0008] The embodiments of the invention disclosed herein provide for automated, accurate analysis of a test spectrum obtained from a sample, to quantitatively and qualitatively identify compounds present in the sample.

[0009] In accordance with one aspect of the invention there is provided a computer-implemented process for automatically identifying compounds in a sample mixture, the process comprising receiving a representation of a measured condition of the sample mixture, using said representation of a measured condition of the sample mixture to select a set of reference spectra of compounds suspected to be contained in said sample mixture, from a library of reference spectra, receiving a representation of a test spectrum having peaks associated with compounds therein, said test spectrum being produced from the sample mixture under said measured condition, and combining reference spectra from said set of reference spectra to produce a matching composite spectrum having peaks associated with at least some of said suspected compounds, that match peaks in said test spectrum, the compounds associated with the reference spectra that combine to produce the matching spectrum being indicative of the compounds in the sample mixture.

[0010] In accordance with another aspect of the invention, there is provided a computer-readable medium for providing computer readable instructions for directing a processor circuit to execute the process described above.

[0011] In accordance with another aspect of the invention, there is provided a signal embodied in a carrier wave, the signal having code segments for providing computer readable instructions for directing a processor circuit to execute the process described above.

[0012] In accordance with another aspect of the invention, there is provided an apparatus for identifying compounds in a sample. The apparatus includes a processor circuit programmed to execute the process described above.

[0013] In accordance with another aspect of the invention there is provided an apparatus for identifying compounds in a sample, the apparatus comprising means for receiving a representation of a measured condition of the sample mixture, means for using said representation of a measured condition of the sample mixture to select a set of reference spectra of compounds suspected to be contained in said sample mixture, from a library of reference spectra, means for receiving a representation of a test spectrum, produced from the sample mixture under said measured conditions, and means for combining reference spectra from said set of reference spectra to produce a matching composite spectrum having peaks representing at least some of said suspected compounds, that match peaks in said test spectrum, the compounds associated with the reference spectra that combine to produce the matching spectrum being the compounds in the sample mixture.

[0014] In accordance with another aspect of the invention there is provided a process for producing a trace file for use in spectrum analysis. The process involves performing a Fourier Transform on Free Induction Decay (FID) data to produce an initial spectrum, filtering a selected region of the initial spectrum to produce a filtered spectrum and phasing the filtered spectrum to produce a measured spectrum having a flat baseline and well defined positive peaks.

[0015] In accordance with another aspect of the invention there may be provided a computer readable medium and/or a signal for providing codes operable to direct a processor circuit to produce a trace file for use in spectrum analysis according to the process described above.

[0016] In accordance with another aspect of the invention there is provided an apparatus for producing a trace file for use in spectrum analysis, the apparatus has a device for automatically performing a Fourier Transform on Free Induction Decay (FID) data to produce an initial spectrum, a device for automatically filtering a selected region of the initial spectrum to produce a filtered spectrum and a device for automatically phasing the filtered spectrum to produce a measured spectrum having a flat baseline and well defined positive peaks.

[0017] In accordance with another aspect of the invention there is provided a process for producing a representation of a spectrum for a hypothetical solution containing a compound, for use in determining the composition of a test sample. The process involves producing a position value for at least one peak of a reference spectrum as a function of a condition of the test sample, and a property of the at least one peak in a base reference spectrum.

[0018] In accordance with another aspect of the invention there is provided a computer-readable medium for providing computer readable instructions for causing a processor circuit to execute the process for producing a representation of a spectrum for a hypothetical solution as described above.

[0019] In accordance with another aspect of the invention there is provided a signal having a segment comprising codes operable to cause a processor circuit to execute the process for producing a representation of a spectrum for a hypothetical solution as described above.

[0020] In accordance with another aspect of the invention there is provided an apparatus for executing the process for producing a representation of a spectrum for a hypothetical solution described above. The apparatus has a processor circuit programmed to produce a position value for at least one peak of a reference spectrum as a function of a measured condition of the test sample, and a property of the at least one peak in a base reference spectrum.

[0021] In accordance with another embodiment, there is provided an apparatus for producing a representation of a spectrum for a hypothetical solution containing a compound, for use in determining the composition of a test sample under a certain condition. The apparatus has a device for receiving a value representing a measured condition of the test sample, a device for receiving a representation of a position of at least one peak in a base reference spectrum and a device for producing a position value for at least one peak of a derived reference spectrum as a function of the measured condition of the test sample, and a property of the at least one peak in a base reference spectrum.

[0022] Other aspects and features of the present invention will become apparent to those ordinarily skilled in the art upon review of the following description of specific embodiments of the invention in conjunction with the accompanying Figures.

BRIEF DESCRIPTION OF THE DRAWINGS

[0023] In drawings which illustrate embodiments of the invention,

[0024] FIG. 1 is a system for determining the quantity of compounds in a test sample, according to a first embodiment of the invention;

[0025] FIG. 2 is a flow chart illustrating an automatic process for conditioning a measured spectrum, as implemented by a workstation shown in FIG. 1;

[0026] FIG. 3 is a pictorial representation of a measured spectrum produced by the workstation shown in FIG. 1;

[0027] FIG. 4 is a flow chart of a routine executed on the workstation shown in FIG. 1, for conditioning the measured spectrum to suppress a peak caused by a solvent in a sample for which the measured spectrum is produced;

[0028] FIG. 5 is a flow chart of a process for identifying compounds executed by a spectrum analysis apparatus shown in FIG. 1;

[0029] FIG. 6 is a pictorial representation of a reference spectrum associated with lactic acid at pH of 5.10;

[0030] FIGS. 7A and 7B are a tabular representation of an Extensible Markup Language (XML) file representation of the reference spectrum of FIG. 6;

[0031] FIG. 8 is a flow chart of a process by which base reference spectrum records such as shown in FIGS. 7A and 7B may be produced;

[0032] FIG. 9 is a process executed by the spectrum analysis apparatus shown in FIG. 1 to identify a peak associated with a calibration compound in a test spectrum;

[0033] FIGS. 10A and 10B are a flow chart of the process for identifying compounds, shown in FIG. 5, in greater detail;

[0034] FIG. 11 is a flow chart of a process executed by the spectrum analysis apparatus for determining a pH value from the test spectrum;

[0035] FIG. 12 is a flow chart of a process executed by the spectrum analysis apparatus for producing a derived reference spectrum;

[0036] FIGS. 13A and 13B are a tabular representation of a base reference spectrum record associated with lactic acid at a pH of 5.45;

[0037] FIGS. 14A and 14B are a tabular representation of a derived reference spectrum record associated with lactic acid at a pH of 5.28;

[0038] FIGS. 15A and 15B are a tabular representation of a generic type of derived reference record in which equations specify center Parts Per Million (PPM) values for peak clusters, according to one embodiment of the invention;

[0039] FIGS. 16A and 16B are a tabular representation of a derived record comprising look-up table links to center PPM values according to another embodiment of the invention;

[0040] FIG. 17 is a flow chart of a process for determining an upper bound concentration estimate; and

[0041] FIG. 18 is a flow chart of a least squares fitting routine referenced by FIG. 10B.

DETAILED DESCRIPTION

[0042] Referring to FIG. 1, a system, according to a first embodiment of the invention, for determining the quantity of compounds in a test sample is shown generally at 10. The system includes a spectrum producing apparatus 12 and a spectrum analysis apparatus shown generally at 14. In this embodiment, the spectrum producing apparatus 12 is a Nuclear Magnetic Resonance (NMR) System provided by Varian Inc. of California, U.S.A. Generally, the system is operable to receive a specially prepared liquid biological test sample and produce a data file comprised of a plurality of (x,y) values which define a measured NMR spectrum. This measured NMR spectrum is then supplied to the spectrum analysis apparatus 14, where a process according to another aspect of the invention is carried out to provide an indication of the quantities of certain compounds in the specially prepared biological test sample.

[0043] The system 10 is suitable for use with biological samples, for example blood or urine, in which the solvent is water, for example. Such samples may be “prepared” by doping them with a small quantity of a condition indicator compound, also referred to as a condition reference compound, and a chemically inert chemical shift calibration standard compound also referred to as a calibration compound. The condition indicator may be trimethylsilyl-1-propanoic acid or Imidazole, where the distortion factor is pH, for example. Alternatively, the sample itself may have a naturally occurring, inherent condition indicator such as glycine, creatinine, urea, citrate, or trimethylamine-N-oxide, for example. The chemical shift calibration standard compound may be 3-[trimethylsilyl]-1-propanesulfonic acid, also known as DSS, for example. Alternatively, the chemical shift calibration standard may be dimethylsulphoxide (DMSO), acetone, or tetramethylsilane (TMS), for example.

[0044] In this embodiment, the spectrum producing apparatus 12 is comprised of a computer workstation 16, an auto sampler 18, a test chamber 20, and a console 22. The workstation is a Sun Workstation with a 400 MHz UltraSPARC IIi CPU with 2 MB level 2 cache, 128 MB RAM, on-board PGX24 graphics controller, 20 GB 7200 r.p.m. EIDE hard disk, 48x CD-ROM drive, 1.44 MB floppy drive and 17″ flat screen color monitor. The workstation runs Varian VNMR software which includes routines for controlling the auto sampler 18 and the console 22 to cause the specially prepared biological liquid sample to be received in the test chamber 20 and to cause the console to acquire and provide to the workstation Free Induction Decay (FID) data representing the free induction decay of electromagnetic radiation absorptions produced by protons in the compounds of the liquid sample as a result of changes in magnetic properties of the protons due to a nuclear magnetic resonance process initiated in the test chamber 20 by the console 22.

Process for Producing a Measured Spectrum

[0045] The FID data is received and stored in memory at the workstation 16. Then, in this embodiment, a process according to an embodiment of another aspect of the invention, is carried out to cause the workstation to produce a measured spectrum for use by the spectrum analysis apparatus 14. Instructions for directing the workstation to automatically carry out the process for producing the measured spectrum are embodied in computer readable codes 24. These computer readable codes 24 may be provided to the workstation 16 in a variety of different forms including a file or files on a computer readable medium such as a CD-ROM 26, or floppy disk 28, for example, or as a file received as a signal from a communications medium such as an internet 30, extranet or intranet, electrical 32, Radio Frequency (RF) 34, or optical medium 36 or any other medium by which a file comprised of said codes may be provided to the workstation 16 to enable the workstation to be directed by the codes to execute the process described herein to produce a measured spectrum.

Autoprocessing

[0046] Generally, an automatic computer-implemented process for producing a measured spectrum from NMR data, may involve operating on free induction decay (FID) data produced by a spectrometer to produce a trace file comprised of intensity and frequency values representing a measured spectrum having a flat baseline and well defined peaks that have positive, well-defined areas, for use in a computer-implemented spectrum analysis process such as the process described herein. In particular, the process may involve performing a Fourier Transform on Free Induction Decay (FID) data to produce an initial spectrum, filtering a selected region of the initial spectrum to produce a filtered spectrum and phasing the filtered spectrum to produce a measured spectrum having a flat baseline and well defined positive peaks.

[0047] Referring to FIG. 2, a flowchart depicting functional blocks implemented by the codes to cause the workstation to execute a specific process for producing a measured spectrum is shown generally at 50. The process begins with a first block 52 that causes the workstation 16 to read and perform an initial Weighted Fourier Transform on the FID data to produce an initial measured spectrum representing signal intensity (i) versus frequency (F).

[0048] Then block 54 causes the workstation 16 to produce parameters for use in a later-executed Fourier Transform performed on the FID data to produce a representation of a measured spectrum having well defined Lorentzian lines with a flat baseline and peaks that have positive, well-defined areas. Thus, the result of block 54 is a set of parameters that controls Fourier Transforms later performed on the FID data to produce a representation of a measured spectrum.

[0049] Block 56 directs the workstation to save the set of parameters in association with the FID data. Block 58 directs the workstation 16 to perform a Fourier Transform on the FID data, using the parameters produced by block 54 to produce a trace file, which is a file comprised of a plurality of (x,y) values that represent a trace of the measured spectrum, representing intensity versus frequency. Block 59 then causes the workstation to save the trace file for transmission to the spectrum analysis apparatus 14 shown in FIG. 1.

[0050] An example of a measured spectrum is shown generally at 41 in FIG. 3. The spectrum is a plot of intensity versus frequency. The x-axis 43 is referenced to parts per million (ppm) and depicts a window of the overall spectrum, the window containing relevant information or features for identifying compounds in the sample. The y-axis 39 is referenced to a zero value and the spectrum has a baseline 37 representing a noise level from which a plurality of peaks 45, 47, 49, 51, 53, 55, 57, 59, 61, 63 associated with various compounds in the sample extend. For example peaks 45 and 47 are associated with Imidazole, peak 49 is associated with Urea and peaks 51 and 53 are associated with Creatinine. Peaks 55 and 57 form a first cluster associated with citric acid and peaks 59 and 61 form a second cluster associated with that compound. Peak 63 is associated with DSS, the calibration compound.

[0051] Referring back to FIG. 2, block 54 which processes the FID data, is shown in greater detail. Block 54 includes sub-functional blocks including a Fourier Transform block 60, a filter selected and/or solvent region block 66 and an automatic phasing block 68, each of which is automatically executed in turn, in the order shown. The process may include an optional spectral window setting block 62 and an optional drift correction block 64, to further process the spectrum, for example.

[0052] The Fourier Transform block 60 has an optional sub-block 70 that causes the workstation to perform a weighted Fourier Transform with weights that provide for enhancement of the initial spectrum. These weights may perform a line broadening function to the initial spectrum, for example. To do this in this embodiment, block 70 causes the workstation to set signal enhancement parameters for use in a subsequently executed weighted Fourier Transform block 72. Such signal enhancement parameters may effect line broadening, line narrowing, or Gaussian sine-bell conditioning, for example, to the resulting spectrum produced by the Fourier Transform block 72. In the Varian VNMR software, this is effected by setting a line broadening variable “lb” to a specified value, which may be 0.5, for example. Also in the VNMR software, the weighted Fourier Transform may be executed by calling the VNMR macro “wft” to perform a weighted Fourier Transform on the FID data, using the lb parameter value set at block 70. This has the effect of broadening the lines or peaks of the spectrum and averaging the spectrum to produce a measured spectrum with a better signal to noise ratio than would be produced without averaging. It also has the effect of eliminating glitches to produce a measured spectrum of better quality.
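The line broadening step described above can be illustrated with a minimal Python sketch. It assumes the common exponential weighting convention, in which each FID point is multiplied by exp(-π·lb·t) before the transform; the exact weighting applied by the VNMR “wft” macro is not reproduced here. The function names, the `dwell_time` parameter and the naive transform are illustrative assumptions and are not part of the VNMR software.

```python
import cmath
import math


def apply_line_broadening(fid, dwell_time, lb):
    """Weight each FID point by exp(-pi * lb * t), with t = n * dwell_time.

    `lb` is a line broadening value in Hz (0.5 is the example value in the
    text); `dwell_time` is the assumed sampling interval in seconds.
    """
    return [s * math.exp(-math.pi * lb * n * dwell_time)
            for n, s in enumerate(fid)]


def dft(signal):
    """Naive discrete Fourier transform, O(N^2); adequate for a small demo."""
    n = len(signal)
    return [sum(signal[k] * cmath.exp(-2j * cmath.pi * j * k / n)
                for k in range(n))
            for j in range(n)]


def weighted_fourier_transform(fid, dwell_time, lb=0.5):
    """Apply the exponential weights, then transform the weighted FID."""
    return dft(apply_line_broadening(fid, dwell_time, lb))


if __name__ == "__main__":
    fid = [1 + 0j] * 8  # hypothetical, noise-free FID
    spectrum = weighted_fourier_transform(fid, dwell_time=0.1, lb=0.5)
    print([round(abs(v), 3) for v in spectrum])
```

Because the weights decay with time, the late and typically noisier part of the FID contributes less to the spectrum, which is the averaging effect the paragraph refers to, at the cost of wider lines.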

[0053] In this embodiment optional block 62 causes the workstation 16 to define a window on the initial spectrum and this may involve scaling the initial spectrum. It is desirable to set the spectral window to a preset size, i.e. a pre-defined range of frequency, to enable the acquisition of repeatable data and for all useful data to be in a pre-defined window and to scale the spectrum such that the height of its maximum peak is a percentage of the height of the window. In this embodiment, this is effected through the VNMR software by executing three sub-functional blocks 74, 76 and 78 that cause the workstation 16 to call the VNMR macros “f”, “full”, and the VNMR command “vsadj”, respectively, in the order shown. The “f” macro sets display parameters “sp” and “wp” for a full display of a 1D spectrum, the “full” macro sets display limits for a full screen so that the spectrum can be seen as wide as possible in the window, and the “vsadj” command automatically sets up the vertical scale “vs” in the absolute intensity mode “ai”, so that the largest peak is of the required height. Effectively this provides for scaling of the spectrum so that the highest peak is 90% of the total window height.
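The scaling outcome described above, a tallest peak at 90% of the window height, can be sketched in a few lines of Python. This only illustrates the arithmetic; it does not reproduce how the VNMR “vsadj” command sets the “vs” parameter, and the function and parameter names are hypothetical.

```python
def scale_to_window(intensities, window_height=1.0, fraction=0.9):
    """Scale a spectrum so its tallest peak is `fraction` of the window height."""
    peak = max(intensities)
    if peak <= 0:
        raise ValueError("spectrum has no positive peak to scale against")
    factor = fraction * window_height / peak
    return [y * factor for y in intensities]


if __name__ == "__main__":
    # Hypothetical intensities; the window is taken to be 100 units tall.
    print(scale_to_window([0.0, 2.0, 4.0, 1.0], window_height=100.0))
```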

[0054] Optional block 64 causes the workstation to produce parameters that perform drift correction on the spectrum to correct the measured spectrum for drift effects, effectively setting the two extremes of the baseline of the spectrum, i.e. the left and right sides of the spectrum to have zero slope. In this embodiment, using the Varian VNMR software, this is achieved by block 80 which causes the workstation 16 to call the “dc” macro of the VNMR software. Effectively the “dc” macro calculates a linear baseline correction. The beginning and end of a straight line to be used for baseline correction are determined from the display parameters “sp” and “wp”. The “dc” command applies this correction to the spectrum and stores the definition of the straight line in the parameters “lvl” (level) and “tlt” (tilt) of the VNMR software. (cdc resets the parameters “lvl” and “tlt” to zero.)
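A linear drift correction of the kind described above can be sketched as follows. The sketch assumes that the ends of the straight line are estimated by averaging a few points at each edge of the displayed window; how the “dc” macro actually estimates them is not stated in the text, and the names `drift_correct` and `edge_points` are hypothetical. The returned `level` and `tilt` play the role of the “lvl” and “tlt” parameters only by analogy.

```python
def drift_correct(intensities, edge_points=4):
    """Subtract a straight line fitted through the two edges of a spectrum.

    Returns the corrected intensities together with the level (intercept)
    and tilt (slope per point) of the subtracted line.
    """
    n = len(intensities)
    if edge_points < 1 or n < 2 * edge_points:
        raise ValueError("spectrum too short for the requested edge width")
    left = sum(intensities[:edge_points]) / edge_points
    right = sum(intensities[-edge_points:]) / edge_points
    # Index positions of the centres of the two edge windows.
    x_left = (edge_points - 1) / 2
    x_right = n - 1 - (edge_points - 1) / 2
    tilt = (right - left) / (x_right - x_left)
    level = left - tilt * x_left
    corrected = [y - (level + tilt * i) for i, y in enumerate(intensities)]
    return corrected, level, tilt


if __name__ == "__main__":
    # Hypothetical spectrum: a sloping baseline with one peak at index 10.
    data = [0.5 + 0.1 * i for i in range(20)]
    data[10] += 5.0
    corrected, level, tilt = drift_correct(data)
    print(round(level, 6), round(tilt, 6), round(corrected[10], 6))
```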

[0055] Block 66 causes the workstation to filter a selected region of the spectrum to adjust the intensity of the spectrum in that region. Filtering may involve applying a notch filter to a selected or solvent region, for example, to suppress a peak associated with a contaminant or solvent in the contaminant or solvent region. This ensures that the solvent region or contaminant region of the spectrum is correctly phased with the rest of the spectrum so that the entire spectrum can be properly phased later. In order to permit the entire spectrum to be phased, the solvent or contaminant residual must be in phase with the rest of the spectrum, ideally reducing the solvent or contaminant region to zero. The solvent region is the region of the spectrum in which solvent compounds in the sample may be found. For example the solvent may be water, in which case the region around the peak in the measured spectrum associated with the compound H₂O is considered to be the solvent region. The contaminant region is a region of the spectrum where peaks associated with contaminants are present.

[0056] Referring to FIG. 4, a routine for filtering the selected region is shown generally at 66 and involves a first block 92 that causes the workstation 16 to apply a notch filter to the selected region to suppress a peak in that region. A set of initial notch filter parameters specifying the attenuation, width and position of the notch filter is used.

[0057] Applying a notch filter may further involve producing an adjusted set of notch filter parameters and applying a notch filter employing the adjusted set of notch filter parameters to the selected region. The set of notch filter parameters may be adjusted to produce an adjusted set of notch filter parameters that may be applied to the notch filter to filter the selected region until a sum of the absolute values of areas defined by peaks above and below a baseline of the initial spectrum is minimized. In this embodiment this is done by block 94 which causes the workstation to adjust the set of initial notch filter parameters and re-apply the notch filter until the sum of the absolute values of the areas of the spectrum in the selected region, is minimized. One quick way of doing this and minimizing the number of iterations of application of the notch filter is to employ numerical methods to successive values produced. For example, in this embodiment, using the Varian VNMR software, the parameter “sslsfrq” specifies a notch filter value that affects the minimization of the sum of the areas above and below the baseline. Brent's method, as described in Brent, R. P. 1973, Algorithms for Minimization without Derivatives (Englewood Cliffs, N.J.: Prentice-Hall), Chapter 5, [1], for example may be used to find an optimum value for “sslsfrq”.
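The parameter search described above can be illustrated with a one-dimensional minimization in Python. To stay short and dependency-free, the sketch uses a golden-section search, which is a simpler derivative-free method than Brent's method and is substituted here purely for illustration. The notch model in `residual_after_notch`, its widths and the solvent position of 4.7 ppm are hypothetical stand-ins for re-applying the real filter with a new “sslsfrq” value.

```python
import math


def absolute_area(intensities, spacing):
    """Sum of absolute areas above and below a zero baseline."""
    return sum(abs(y) for y in intensities) * spacing


def residual_after_notch(notch_position, grid, solvent_position=4.7):
    """Toy model: a Lorentzian solvent peak attenuated by a Gaussian notch."""
    residual = []
    for x in grid:
        peak = 1.0 / (1.0 + ((x - solvent_position) / 0.02) ** 2)
        attenuation = math.exp(-((x - notch_position) / 0.05) ** 2)
        residual.append(peak * (1.0 - attenuation))
    return residual


def golden_section_minimize(func, low, high, tol=1e-6):
    """Derivative-free minimizer for a function with one minimum in [low, high]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = low, high
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = func(c), func(d)
    while (b - a) > tol:
        if fc < fd:
            # Minimum lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = func(c)
        else:
            # Minimum lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = func(d)
    return (a + b) / 2


if __name__ == "__main__":
    grid = [4.0 + 0.01 * i for i in range(141)]  # 4.00 to 5.40 ppm
    best = golden_section_minimize(
        lambda p: absolute_area(residual_after_notch(p, grid), 0.01), 4.2, 5.4)
    print(round(best, 3))
```

Each evaluation of the objective corresponds to one re-application of the filter, so a method that converges in few evaluations keeps the number of filter passes small, which is the motivation the paragraph gives for using a numerical method.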

[0058] Referring back to FIG. 2, after filtering the selected region, block 68 is invoked to automatically phase the entire spectrum and make the peaks as symmetrical as possible. This may be done iteratively, for example, by adjusting the real and imaginary components of the transformed FID data until the resulting spectrum has positive, well defined peaks. In this embodiment, employing the Varian VNMR software, this is achieved by invoking block 84 which calls the “aph0” command of the VNMR software. Some versions of the VNMR software may require more than one successive execution of the aph0 command.

[0059] After automatic phasing parameters of the spectrum have been produced, optionally, a baseline correction block 69 may be executed to flatten out the baseline of the spectrum. Alternatively, baseline correction may be performed later. Baseline correction may be done by analysing the spectrum to determine areas with peaks and areas devoid of peaks and setting areas devoid of peaks to have a common intensity value such as zero, for example. An example of baseline correction is available at www.acdlabs.com/publish/nmr_ar.html, published by Advanced Chemistry Development Inc. of Toronto, Ontario, Canada.
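One simple way to separate areas with peaks from areas devoid of peaks is an intensity threshold derived from a robust noise estimate. The Python sketch below makes that assumption explicitly: it uses the median and the median absolute deviation of the spectrum as the noise estimate, and the name `flatten_baseline` and the factor of 5 are hypothetical choices rather than anything specified in the text or in the referenced software.

```python
import statistics


def flatten_baseline(intensities, threshold_factor=5.0):
    """Set points judged to be devoid of peaks to zero.

    A point is kept only if it rises above the median intensity by more than
    `threshold_factor` times the median absolute deviation.
    """
    center = statistics.median(intensities)
    mad = statistics.median([abs(y - center) for y in intensities])
    threshold = center + threshold_factor * mad
    return [y if y > threshold else 0.0 for y in intensities]


if __name__ == "__main__":
    # Hypothetical spectrum: low-level noise around a three-point peak.
    spectrum = [0.1, -0.1, 0.05, -0.05, 0.1, 8.0, 12.0, 7.5,
                -0.1, 0.05, 0.0, -0.05]
    print(flatten_baseline(spectrum))
```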

[0060] Block 56 then causes the workstation 16 to save parameters produced by the various sub-processes of block 54 in association with the FID data and text, if desired. With the Varian VNMR software this may be achieved using the ‘svf($savefid)’ command.

[0061] Block 58 then directs the workstation 16 to produce a trace file comprised of (x,y) values representing intensity versus frequency, by performing a Fourier Transform on the FID data, using the parameters produced as described above and associated with FID data. The trace file is then transferred or transmitted to the spectrum analysis apparatus 14 or is stored for later transfer to that apparatus.

Spectrum Analysis Apparatus

[0062] In the embodiment shown, the spectrum analysis apparatus (SAA) 14 is a separate component and includes a Linux workstation configured to receive the trace file representing the measured spectrum, from the spectrum producing apparatus 12. The spectrum analysis apparatus 14 is configured to receive and execute instructions embodied in computer readable codes to carry out a process for identifying compounds in a sample according to an embodiment of another aspect of the invention. The codes may be provided to the spectrum analysis apparatus through any of the media described above including the CD-ROM 26, Floppy disk 28, internet 30, extranet, intranet, electrical 32, RF 34, and optical 36 media and/or any other media capable of providing codes to the spectrum analysis apparatus.

[0063] It will be appreciated that the workstation 16 may alternatively be configured with both the codes to effect the process for producing a measured spectrum shown in FIG. 2 and the codes to effect the process for identifying compounds, or either of these. It is desirable however, to execute the process for identifying compounds at a computer other than the workstation 16, to enable the process for identifying compounds to be executed while another sample is being subjected to the NMR process, for example.

Process for Identifying Compounds

[0064] Referring to FIG. 5, generally, the process for identifying compounds involves identifying representative reference spectra from a set of reference spectra associated with detectable compounds and selected according to a condition of the sample, which collectively define a composite reference spectrum having features matching a set of features in a test spectrum produced from the sample. Once the representative reference spectra have been identified, compounds with which they are associated may be identified.

[0065] The compounds associated with respective reference spectra of the identified set are the compounds that may be expected to be present in the sample. Quantities of the compounds may be determined from the intensities of certain representative peaks in the test spectrum which are associated with the compounds, relative to the intensity of a peak associated with the chemical shift calibration standard compound which is unaffected by the condition of the sample. A condition may be the pH of the sample, for example, and an accurate measurement of pH can be obtained from the test spectrum. Thus, given a test spectrum of a sample and given a set of reference spectra, the process can identify and quantify compounds present in the sample. Alternatively, the condition may be temperature, osmolality, salt concentration, chemical composition, or solvent, for example.
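The idea of combining reference spectra into a composite that matches the test spectrum can be sketched as an ordinary least squares problem. The Python below is a generic, unconstrained fit solved through the normal equations; it is not the fitting routine of FIG. 18, whose details are not given in this section, and it ignores the selection of reference spectra by measured condition. All function names and the sample data are hypothetical.

```python
def solve_linear(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("reference spectra are linearly dependent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][k] * solution[k] for k in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution


def fit_composite(test, references):
    """Find weights so that the weighted sum of references best matches test."""
    gram = [[sum(p * q for p, q in zip(ri, rj)) for rj in references]
            for ri in references]
    proj = [sum(p * y for p, y in zip(ri, test)) for ri in references]
    weights = solve_linear(gram, proj)
    composite = [sum(w * ref[i] for w, ref in zip(weights, references))
                 for i in range(len(test))]
    return weights, composite


if __name__ == "__main__":
    ref_a = [0.0, 1.0, 0.2, 0.0]  # hypothetical reference spectrum A
    ref_b = [0.0, 0.1, 2.0, 0.3]  # hypothetical reference spectrum B
    test = [3.0 * a + 0.5 * b for a, b in zip(ref_a, ref_b)]
    weights, _ = fit_composite(test, [ref_a, ref_b])
    print([round(w, 6) for w in weights])
```

In this toy setting a weight near zero would suggest that the corresponding compound is absent, while the non-zero weights scale each reference spectrum to its contribution in the test spectrum.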

Reference Spectra

[0066] Before the process for identifying compounds can be carried out, a set of reference spectra for compounds to be detected in the sample must be made available to the SAA 14. This can be done by storing data relating to reference spectra associated with respective compounds and allowing the SAA 14 access to the data. An exemplary reference spectrum for a given compound may initially be represented in the form of intensity versus frequency (x,y) values, which may be represented graphically. A reference spectrum for lactic acid is shown in FIG. 6, for example. It will be appreciated that such a spectrum may have a plurality of peaks and/or clusters of peaks 150, 152, 154, 156, 158, 160, 162, 164, 166, 168, 170 superimposed upon a featureless background, such as noise 172. The resolution along the x axis is dependent upon the frequency of the Magnet used in the Nuclear Magnetic Resonance Process employed to acquire the sample. The peaks that are associated with lactic acid are found in first and second clusters 166 and 154. These clusters are centered at 1.322 ppm and 4.119 ppm respectively. The first cluster is comprised of two peaks and the second cluster 154 is comprised of four peaks.

[0067] A reference spectrum of the type shown in FIG. 6 can be represented in various formats including mathematical representations such as Lorentzian equations which may specify peaks associated with the compound the spectrum is intended to represent. Such equations have the form:

$f(x) = \frac{a\,w^{2}}{w^{2} + 4\left(x - c\right)^{2}}$

[0068] where: a represents the amplitude of the peak;

[0069] w represents the width of the peak; and

[0070] c represents the center of the peak.

[0071] Thus, for example, the two peaks associated with the cluster centered on 1.322 ppm may be specified by two sets of Lorentzian line shape parameters a, w and c.
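The line shape defined above can be evaluated directly. The following is a minimal sketch in Python, assuming illustrative amplitude, width and center values for a two-peak cluster near 1.322 ppm. The function names `lorentzian` and `cluster_intensity` and all numeric values are hypothetical and are not taken from FIG. 6 or from FIGS. 7A and 7B.

```python
def lorentzian(x, a, w, c):
    """Evaluate f(x) = a*w^2 / (w^2 + 4*(x - c)^2)."""
    return a * w * w / (w * w + 4.0 * (x - c) ** 2)


def cluster_intensity(x, peaks):
    """Sum the Lorentzian peaks of one cluster at position x.

    peaks is a list of (a, w, c) tuples, one tuple per peak.
    """
    return sum(lorentzian(x, a, w, c) for a, w, c in peaks)


# Hypothetical parameters for a two-peak cluster centered near 1.322 ppm.
doublet = [(1.0, 0.002, 1.316), (1.0, 0.002, 1.328)]
print(cluster_intensity(1.316, doublet))
```

With this form, f(c) = a and f(c ± w/2) = a/2, so w acts as the full width of the peak at half of its amplitude.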

[0072] The Lorentzian line shape parameters for each peak associated with a given compound may be stored in a base reference spectrum record embodied in an XML file as shown in FIGS. 7A and 7B, for example. Such a file may have fields 200, 202 and 204, for example, for storing compound information, experiment information and cluster/peak information respectively. The compound information field may include sub-fields for storing the name of the compound with which the record is associated, and the molecular weight of the compound, for example. The experiment field may have sub-fields for storing information about the experiment, such as conditions under which the peak information about the compound was collected. This may include the pH of the solution that was analyzed, the temperature of the solution, the calibration reference compound ratio, the concentration of the compound in the solution, a timestamp, a source file name, the frequency of the magnet used in the NMR process, and the spectral width of the entire spectrum, for example. The cluster/peak information fields may include separate fields 206 and 208 for each cluster (166 and 154 in FIG. 6).

[0073] Each cluster field 206 and 208 may include sub-fields 210, 212, 214, 216 and 218 for representing information relating to the proton number of the cluster, the quantification of the cluster, the Lorentzian line width adjustment of the cluster and first and second peak sub-fields respectively. The first and second peak sub-fields may include fields 220, 222, and 224 for representing offset center information, height information and proton ratio information relating to a respective peak in the cluster, respectively.

[0074] Effectively, the Lorentzian line shape parameters (a) and (c) for each peak may be stored in the height and offset center fields 222 and 220 respectively, and each peak in a given cluster is considered to have the same width (w), which is specified by the contents of the Lorentzian line width adjust field 214 associated with the cluster.
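The record layout described above may be easier to follow as an in-memory data structure. The following Python sketch is only an illustration: the class and attribute names are hypothetical (the actual XML element names are those of FIGS. 7A and 7B), only a few of the experiment sub-fields are shown, and it is assumed here that the offset center of a peak is measured relative to the center of its cluster.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Peak:
    offset_center: float  # field 220 (assumed relative to the cluster center)
    height: float         # field 222, Lorentzian amplitude a
    proton_ratio: float   # field 224


@dataclass
class Cluster:
    center_ppm: float     # cluster center position in ppm
    proton_number: int    # field 210
    quantification: int   # field 212
    width_adjust: float   # field 214, width w shared by all peaks in the cluster
    peaks: List[Peak] = field(default_factory=list)

    def lorentzian_parameters(self) -> List[Tuple[float, float, float]]:
        """Return one (a, w, c) tuple per peak; every peak uses the cluster width."""
        return [(p.height, self.width_adjust, self.center_ppm + p.offset_center)
                for p in self.peaks]


@dataclass
class ReferenceRecord:
    compound_name: str    # compound information, field 200
    ph: float             # experiment information, field 202
    temperature: float    # experiment information, field 202
    clusters: List[Cluster] = field(default_factory=list)  # field 204
```

Storing the width once per cluster, as in `Cluster.width_adjust`, mirrors the statement that all peaks in a cluster are considered to have the same width.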

[0075] Referring to FIG. 8, a process by which base reference spectrum records may be produced is shown generally at 230. The process begins with block 232 representing the preparation of a liquid solution containing a reference compound such as lactic acid, a calibration compound such as DSS and a condition indicator compound such as imidazole. The liquid solution is prepared to a carefully calibrated concentration of the calibration compound at a carefully controlled temperature and pH. This step is carried out in a laboratory, by a human or by a mechanized process, for example.

[0076] Once the liquid solution containing the reference compound has been produced, as shown in block 234, it is subjected to the NMR process carried out by the apparatus 12 shown in FIG. 1 to produce FID data.

[0077] At block 236, the apparatus 12 subjects the FID data produced by the NMR process to the process shown in FIG. 2, to produce a measured reference spectrum.

[0078] Having obtained a measured reference spectrum, a process as shown in block 238 is initiated to identify the calibration compound and obtain calibration parameters. This process is shown in greater detail at 238 in FIG. 9. Referring to FIG. 9, the codes direct the SAA 14 to derive from the measured reference spectrum a characterization of the calibration compound contained in the sample. This involves identifying a position of a peak of the measured reference spectrum that meets a set of criteria that associate the peak with the calibration compound, and further involves producing parameters for a mathematical model of the peak that best represents the peak. Thus, in this embodiment the characterization is a list of Lorentzian line shape parameters (w, c and a) representing width, peak position and center amplitude respectively of a Lorentzian curve that best describes a feature, that is, a peak, of the measured reference spectrum that is associated with the calibration compound. It will be appreciated that other characterizations could be used, such as those produced by peak picking, linear least squares fitting, the Levenberg-Marquardt method, or a combination of these methods.

[0079] To find a peak associated with the calibration compound and to produce a list of Lorentzian line shape parameters that characterize it, the SAA 14 is programmed with codes that include a first block 250 that directs the SAA 14 to determine a noise level at a pre-defined area of the measured reference spectrum. In this embodiment, it is known that an area on the x-axis (frequency) between positions 64,000 and 65,000, for example, can be expected to be void of peaks and contain only noise. The standard deviation of the y-value (signal intensity) over this region of the measured reference spectrum is representative of the noise level of the entire spectrum and provides a measure of the noise level.

[0080] Next, block 252 directs the SAA 14 to scan the measured reference spectrum in the negative x-direction, beginning at the higher order end of the spectrum, to find a y-value that meets a certain criterion. For example, the criterion may be that the y-value must exceed the noise level by a pre-determined amount, such as a factor of 10, at the top of a peak. A y-value meeting this criterion is assumed to be associated with an x-value that represents the position of a peak associated with the calibration compound.
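The noise estimate of block 250 and the scan of block 252 might be sketched as follows in Python. This is a simplified illustration under stated assumptions: the spectrum is held as a plain list of y-values indexed by x position, the noise-only region is supplied as a pair of indices, and the scan climbs to the local maximum once the threshold is first exceeded. The function names are hypothetical.

```python
import statistics


def noise_level(y, start, stop):
    """Standard deviation of the intensities in a region assumed to hold only noise."""
    return statistics.stdev(y[start:stop])


def find_calibration_peak(y, noise, factor=10.0):
    """Scan from the high end of the spectrum toward lower x positions.

    Returns the index of the top of the first peak whose intensity exceeds
    factor * noise, or None when no point meets the criterion.
    """
    threshold = factor * noise
    i = len(y) - 1
    # Walk in the negative x-direction until the threshold is exceeded.
    while i >= 0 and y[i] <= threshold:
        i -= 1
    if i < 0:
        return None
    # Keep walking while the intensity is still rising, to reach the peak top.
    while i > 0 and y[i - 1] >= y[i]:
        i -= 1
    return i
```

The index returned by `find_calibration_peak` is only an approximate peak position; it would serve as the starting point for the curve fitting step that follows.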

[0081] Block 254 then directs the SAA 14 to employ the x-value representing the approximate position of the calibration peak in the test spectrum in a fitting algorithm that fits a curve to the calibration peak and specifies width, height and position values. For example, a Lorentzian line shape-fitting algorithm may be employed to produce Lorentzian line shape parameters (a, w and c) that define a Lorentzian line shape that best matches the calibration peak.

[0082] Referring back to FIG. 8, having calculated Lorentzian line shape parameters that identify and characterize the calibration compound, block 240 is carried out to associate other input data with the measured reference spectrum. Other input data may include information associated with the name and experiment fields 200 and 202 and information such as the number of protons (proton number in XML file) for each cluster and the proton ratio for each peak, for example.

[0083] Next, at block 242, the measured reference spectrum is characterized by employing the well-known Conjugate Gradient method to determine Lorentzian line shape parameters (a, w and c) for each peak or to determine sets of such parameters that define a mathematical model or models of peaks that best fit the important peaks of the measured reference spectrum.

[0084] At block 244, a base reference spectrum record of the type shown in FIGS. 7A and 7B is produced from the other input data and the characterization of the spectrum. At block 246, the base reference spectrum record is stored in a reference record library, which effectively includes a plurality of reference records for various different reference compounds. For example, the reference record library may include base reference spectrum records for: L-phenylalanine, L-Threonine, Glucose, Citric Acid, Creatinine, Dimethylamine, Glycine, Hippuric acid, L-alanine, L-Histidine, L-Lactic Acid, L-Lysine, L-Serine, Taurine, Trimethylamine, Trimethylamine-N-Oxide, Urea, L-Valine, and Acetone.

[0085] Reference records may include base reference records or derived reference records. Base reference spectrum records may be produced by empirical processes as described above. New records known as derived reference records may be produced by operating on data from base reference records, and represent derived reference spectra. Operating on data may include interpolation and/or performing mathematical operations, and/or using a lookup table, for example. Thus, for example, a limited set of base reference spectrum records can be produced, including a record representing the spectrum for lactic acid at a pH of 5.1, and a record for lactic acid at a pH of 5.45, for example. A derived reference record representing the spectrum of lactic acid in a solution having a pH of 5.28, for example, can then be produced by performing mathematical operations on the Lorentzian line shape parameters specified by the base records associated with solutions at pH 5.1 and pH 5.45 to interpolate values for a solution at a pH of 5.28. Thus, a derived set of reference records can be produced for solutions of any pH, within a reasonable range, when required, thereby avoiding a priori production of base reference records for every pH condition. As will be appreciated below, this feature may be exploited by determining the pH value of a sample under test and using the determined pH value to produce a set of derived reference records for use in identifying compounds present in the sample. In other words, reference records for use in the process for identifying and quantifying compounds are selected from existing base reference records or are “selected” by producing derived reference records, according to a condition of the sample. In this embodiment, the condition is pH.

Process for Identifying Compounds

[0086] After having produced a reference library of base reference spectrum records, the process of identifying and quantifying compounds in a test sample can be carried out.

Process for Identifying and Quantifying Compounds

[0087] The process is shown generally at 300 in FIGS. 10A and 10B and begins with an optional first block of codes 302 that cause the SAA to perform a spectrum conditioning step.

Spectrum Conditioning

[0088] If the measured NMR spectrum of the test sample is of sufficient quality, it can be used directly in subsequent operations of the process disclosed herein. However, usually, the measured spectrum will not be of sufficient quality and will require further processing to condition it for later use. This further conditioning may involve baseline correction as described earlier, for example, to produce a conditioned spectrum.

[0089] Thus the following description will refer to a test spectrum, which may be the measured spectrum described above, if such measured spectrum is of sufficient quality, or it may be a conditioned spectrum. A measured spectrum having a corrected baseline, for example, would be an example of a measured spectrum that would not need to be subjected to further processing to condition it. Usually, however, the process will involve producing a test spectrum from the measured spectrum.

Calibration Determination

[0090] After being provided with, or after producing, a test spectrum of the type described, the process involves block 304 to produce a characterization of a calibration compound in the sample or block 306 to determine a representation of a condition of the sample. These two functions can be done independently, or the condition of the sample can be determined after first characterizing the calibration compound.

[0091] The process of characterizing the calibration compound generally involves identifying a peak associated with a calibration compound in the test spectrum. This may involve identifying a peak meeting a set of criteria that associate the peak with the calibration compound. The peak associated with the calibration compound may be characterized by producing Lorentzian line shape parameters to represent the peak.

[0092] Block 304 relating to characterizing the calibration compound involves a call to the process shown in FIG. 9 to cause the SAA 14 to produce a set of Lorentzian values (a, w and c) which best represent the peak associated with the calibration compound in the test spectrum.

Condition Factor Determination

[0093] Optionally, as shown by block 308, a separate measuring device may be used to measure the selected condition of the test sample. In this embodiment, the measured condition is pH, which may be measured by a separate pH meter to produce a pH condition value that may be supplied to the SAA as indicated at “C” in FIG. 10, for use in later functions of the process.

[0094] If the condition value has not already been obtained, the condition value can desirably be derived from the test spectrum itself as shown at block 306. This is possible where the measured condition is pH because a peak associated with a pH indicator compound in a sample can be readily identified from the test spectrum and the Lorentzian line shape values that characterize the representation of the calibration compound in the test spectrum.

[0095] Referring to FIG. 11, a process for determining a pH condition value from the test spectrum is shown generally at 310. Basically, the process involves identifying a position, height and width of a peak associated with a condition reference compound in the test spectrum, and this may involve identifying a peak meeting a set of criteria that associate the peak with the condition reference compound. Once the peak is identified, the measured condition value may be produced as a function of the peak position and parameters of the sample medium, the parameters being the parameters that define the calibration compound.

[0096] To achieve this, in this embodiment, the codes include a block 312 which directs the SAA 14 to employ the Lorentzian line shape parameter (c) associated with the calibration compound to locate a window in the test spectrum where a peak associated with the pH indicator compound is expected to be. The window is then scanned along the x-axis (frequency) from left to right, for example, for a y-value (intensity) that is greater than the amplitude value specified by the Lorentzian line shape parameter (a).

[0097] When a y-value meeting the above criteria is found, block 314 causes the SAA 14 to execute a characterization algorithm to produce at least a center value (c) representing the center of the peak associated with the pH reference compound. For example, a Lorentzian curve algorithm may be used to produce Lorentzian parameters a, w and c defining the peak associated with the pH reference compound.

[0098] Block 316 then directs the SAA 14 to execute a modified pH titration equation as shown below on the center value c, and to use certain parameters of the sample solvent in the equation, to produce a condition value representing pH of the sample:

$pH = pK_{A} - \log\left[\frac{\delta_{obs} - \delta_{A}}{\delta_{HA} - \delta_{obs}}\right]$

[0099] where: δ_(obs) is the observed chemical shift (center c);

[0100] δ_(A) is the chemical shift of the conjugate base;

[0101] δ_(HA) is the chemical shift of the conjugate acid; and

[0102] pK_(A) is the pK value corresponding to an association constant for the conjugate base.
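As a hedged illustration of block 316, the modified titration equation can be written as a small Python function. The function name `ph_from_shift` is hypothetical, the logarithm is assumed to be base 10, and the chemical shift values used in the example call are placeholders rather than measured constants for any particular indicator compound.

```python
import math


def ph_from_shift(delta_obs, delta_a, delta_ha, pk_a):
    """Apply pH = pK_A - log10[(delta_obs - delta_A) / (delta_HA - delta_obs)]."""
    numerator = delta_obs - delta_a
    denominator = delta_ha - delta_obs
    if denominator == 0.0 or numerator / denominator <= 0.0:
        # The observed shift must lie strictly between the two limiting shifts.
        raise ValueError("observed shift is outside the titration range")
    return pk_a - math.log10(numerator / denominator)


# Placeholder limiting shifts and pK_A, for illustration only.
print(ph_from_shift(delta_obs=7.5, delta_a=7.0, delta_ha=8.0, pk_a=7.0))
```

When the observed shift lies exactly midway between the two limiting shifts, the ratio is 1 and the function returns pK_A, which is a convenient check on the sign convention.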

[0103] Assume that no matter what method of determining pH is used, a pH value of 5.28 is obtained for the sample. Referring back to FIG. 10B, block 320 directs the SAA to receive the condition value, either produced externally, such as by measurement, or produced internally, such as by using the test spectrum as described above, to produce a derived reference record representing a derived reference spectrum for use in later functions of the process. Separate derived reference records may be produced from corresponding base reference spectrum records associated with corresponding compounds expected to be in the sample. Thus, in effect, a representation of a set of derived reference spectra may be produced from a set of reference spectra and the measured condition value. In general, a process for producing a representation of a spectrum for a hypothetical solution containing a compound, for use in determining the composition of a test sample, involves producing a position value for at least one peak of a reference spectrum as a function of the measured condition of the test sample and a property of the at least one peak in a base reference spectrum. The property may be a position of a peak, amplitude of the peak or width of the peak, for example. In this embodiment, a derived reference record is used to represent a spectrum for the hypothetical solution.

[0104] Referring to FIG. 12, producing a derived reference record may involve accessing a pre-defined record specifying peaks in a reference spectrum and adjusting a position value in the record, the position value being the position value of the at least one peak. This may be done by block 322, which causes the SAA to identify a base reference spectrum record that is associated with a condition nearest to the measured condition of the sample and to use such reference spectrum as the derived reference spectrum.

[0105] Producing a position value for a peak may involve interpolating a position value from position values associated with base reference spectra associated with condition values above and below the measured condition value associated with the sample. For example, block 324 may be employed to cause the SAA 14 to produce a position value by calculating the position value as a function of pH of the sample and to effectively produce or interpolate a derived reference spectrum.

[0106] To interpolate a derived reference spectrum, assume that at block 322 a base reference record for lactic acid at a pH of 5.10 is located as being the base reference spectrum record for lactic acid that is nearest to the pH of the sample, 5.28. Such a record is shown in FIGS. 7A and 7B. Referring back to FIG. 12, block 324 may direct the SAA 14 to find another base reference spectrum record for lactic acid that is associated with a pH value greater than the pH of the sample. Assume that it locates a base reference spectrum record associated with lactic acid at a pH of 5.45. A record of this type is shown in FIGS. 13A and 13B. On locating this second base reference spectrum record, block 324 directs the SAA 14 to create a new derived reference spectrum record for lactic acid at a pH of 5.28. To do this the SAA 14 is directed to make a copy of the base reference spectrum record associated with a pH of 5.45 and then to replace the frequency values for the center position of each cluster shown in that record with interpolated values. A simple linear interpolation is used to find the value 1.3202 for the first cluster and the value 4.1149 for the second cluster. FIGS. 14A and 14B show the resulting derived reference spectrum record for a pH of 5.28, for lactic acid, produced using this method. Similarly, derived reference spectrum records are produced for each compound in the reference library to produce derived reference records for a pH of 5.28 for each compound represented in the library.
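The interpolation step of block 324 can be sketched in Python as shown below. This is an illustrative sketch only: records are reduced to plain dictionaries with hypothetical keys, and the cluster center values are placeholders chosen for the example rather than the contents of FIGS. 7A, 7B, 13A and 13B, so the numbers printed are not expected to reproduce the values 1.3202 and 4.1149 quoted above.

```python
import copy


def interpolate_center(ph, ph_low, center_low, ph_high, center_high):
    """Linearly interpolate a cluster center between two base records."""
    fraction = (ph - ph_low) / (ph_high - ph_low)
    return center_low + fraction * (center_high - center_low)


def derive_record(ph, low_record, high_record):
    """Copy the higher-pH base record and replace its cluster centers."""
    derived = copy.deepcopy(high_record)
    derived["pH"] = ph
    derived["centerPPM"] = [
        interpolate_center(ph, low_record["pH"], low, high_record["pH"], high)
        for low, high in zip(low_record["centerPPM"], high_record["centerPPM"])
    ]
    return derived


# Placeholder base records for one compound at two pH values (hypothetical centers).
base_low = {"compound": "lactic acid", "pH": 5.10, "centerPPM": [1.322, 4.119]}
base_high = {"compound": "lactic acid", "pH": 5.45, "centerPPM": [1.318, 4.111]}
print(derive_record(5.28, base_low, base_high)["centerPPM"])
```

Copying the base record before editing it leaves the library of base records unchanged, so the same pair of records can be reused for any later sample pH.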

[0107] Alternatively, adjusting the position of a peak may involve locating a measured condition value dependent function in a base reference record, or pre-defined record, producing the position value from the function and associating the position value with the pre-defined record. Associating may involve storing the position value in the pre-defined record, for example. To effect this method of adjusting the position of a peak, a generic type of derived record may be kept, in which equations effectively specifying the centerPPM values for the two clusters as a function of pH may be provided in the field associated with the centerPPM value for each cluster, as shown in FIGS. 15A and 15B. Then, whenever a pH value is found from a sample, a copy of the record can be made and the pH value may be used in the equations in the copied record to produce centerPPM values. These centerPPM values can then be substituted for the respective equations that produced them, in the copied record, thereby producing a new derived record for use in later calculations.

[0108] Alternatively, producing a position value may involve producing the position value by addressing a lookup table of position values with the measured condition value of the sample. For example, the position value of a peak may be adjusted by locating, in a pre-defined record, a link to a lookup table specifying peak positions for various condition values, retrieving the position value from the lookup table and associating the position value with the pre-defined record. To do this, a second generic type of derived record may be kept, in which lookup table links, effectively specifying links to lookup tables (not shown) that return centerPPM values for input pH values, may be provided in the field associated with the centerPPM value for each cluster, as shown in FIGS. 16A and 16B. Then, whenever a pH value is found from a sample, a copy of the record can be made and the pH value may be used to address the lookup tables associated with the links specified in the record to produce centerPPM values. These centerPPM values can then be substituted for the respective links that produced them, in the copied record, thereby producing a new derived record for use in later calculations.

[0109] Referring back to FIG. 10B, after having produced a derived reference spectrum for each compound that is likely to be in the sample, block 326 causes the SAA 14 to calibrate the Lorentzian line width values for the derived reference spectrum relative to the test spectrum to provide for a better fit to the test spectrum. To do this, block 326 may direct the SAA 14 to calibrate the spectral linewidths of peaks associated with each of the reference compounds to the (a, c and w) values associated with the calibration compound in the sample. In this embodiment, block 326 may direct the SAA 14 to employ the contents of the Lorentzian width adjust field 214 of each derived reference spectrum record to produce respective absolute values representing actual linewidths relative to the calibration compound linewidth. These modified spectral linewidths may be associated with respective peaks in the same cluster of each reference compound, by storing these modified spectral linewidths in an internal data structure (not shown) that associates modified spectral information with derived reference records.

[0110] Still referring to FIG. 10B, optionally, compound-specific adjustments as shown by block 328 may be made to the contents of the fields of the derived reference records, where it is known, for example, that certain effects occur when certain reference compounds are present in the test sample. For example, the shift of peaks associated with citrate is affected by the presence or absence of certain divalent cations, and therefore the process may include a compound-specific adjustment to compensate for shifts known to occur when the presence of such divalent cations is known. Other compound-specific adjustments may be made to compensate for shifts due to temperature, chemical interactions, dilution effect and other ligand effects.

Cluster Centering

[0111] Still referring to FIG. 10B, the process may further involve a cluster centering step as shown at 330 for shifting the derived reference spectrum in frequency (x-direction) to better align it with the test spectrum. This may involve producing a cluster position indicator for a derived reference spectrum, which causes the positions of peaks in the derived reference spectrum to match corresponding peaks in the test spectrum. A cluster position indicator already associated with the derived reference spectrum may be used, or a cluster position indicator that produces a match of the derived reference spectrum to the test spectrum to a defined degree may be derived from the cluster position indicator already associated with the derived reference spectrum. In the embodiment shown, producing a cluster center indicator is achieved by attempting to fit the cluster to the test spectrum. To do this, cluster center values around the cluster center value already associated with the derived reference spectrum are assigned to the derived reference spectrum and used to effectively shift the derived reference spectrum to the left and right of the current cluster center value. For example, cluster center values spaced 0.001 ppm apart are successively assigned to the derived reference spectrum to shift its center to successive points in a window extending from −0.003 ppm to +0.003 ppm about the currently assigned cluster center. At each point, the derived reference spectrum is used in a Levenberg-Marquardt (LM) fitting algorithm that determines a correlation value for each position of the center of the derived reference spectrum in the window. The center position that causes the LM fitting algorithm to produce the best correlation value is then associated with the derived reference spectrum and is used in later calculations. Thus, in effect, the derived reference spectrum is “wiggled” into alignment with the test spectrum. This wiggling is done independently for each cluster of peaks in the derived reference spectrum.
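The window search described above might be sketched as follows. Note that this Python sketch does not implement a Levenberg-Marquardt fit: as a plainly named substitute, it scores each candidate center with a Pearson correlation between the shifted cluster model and the test spectrum. The function names, the (a, w, offset) peak layout and the default step and window values are assumptions made for illustration.

```python
import math


def lorentzian(x, a, w, c):
    return a * w * w / (w * w + 4.0 * (x - c) ** 2)


def cluster_model(xs, center, peaks):
    """Evaluate a cluster whose peaks are (a, w, offset) relative to center."""
    return [sum(lorentzian(x, a, w, center + off) for a, w, off in peaks)
            for x in xs]


def pearson(u, v):
    """Pearson correlation of two equal-length sequences (0.0 if either is flat)."""
    mean_u = sum(u) / len(u)
    mean_v = sum(v) / len(v)
    cov = sum((p - mean_u) * (q - mean_v) for p, q in zip(u, v))
    norm_u = math.sqrt(sum((p - mean_u) ** 2 for p in u))
    norm_v = math.sqrt(sum((q - mean_v) ** 2 for q in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return cov / (norm_u * norm_v)


def center_cluster(xs, test_y, center, peaks, step=0.001, half_window=0.003):
    """Try centers in [center - half_window, center + half_window]; keep the best."""
    steps = round(half_window / step)
    best_center, best_score = center, -math.inf
    for k in range(-steps, steps + 1):
        candidate = center + k * step
        score = pearson(cluster_model(xs, candidate, peaks), test_y)
        if score > best_score:
            best_center, best_score = candidate, score
    return best_center, best_score
```

Because each cluster is centered independently, `center_cluster` would be called once per cluster, with `xs` and `test_y` restricted to the part of the test spectrum around that cluster.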

Upper Bound Concentration Estimates

[0112] Still referring to FIG. 10B, in this embodiment, the process for identifying and quantifying further involves block 332, which causes the SAA 14 to produce an upper bound estimate of a quantity of a compound associated with a derived reference spectrum, for use in a least squares algorithm later in the process. In general, producing an upper bound concentration estimate comprises selecting as the upper bound concentration estimate a lowest concentration value selected from a plurality of concentration values calculated from respective peaks in the test spectrum. This may involve finding the height of a peak in the test spectrum that corresponds to a peak in the reference spectrum and determining a concentration value for the peak as a function of its height. Prior to determining a concentration estimate for a peak, the process may involve predicting whether the height of a peak in the test spectrum is greater than a threshold level and deciding not to determine a concentration for the peak when the height is less than the threshold level.

[0113] Referring to FIG. 17, a process implemented by program codes operating on the SAA 14 of FIG. 1 for producing an upper bound concentration estimate is shown generally at 340. A first block 342 causes the SAA 14 to select a reference record. Next, block 344 causes the SAA 14 to sort by height those peaks in the reference record that have a quantification value equal to 1. This causes the process to consider only those peaks that provide reliable concentration estimates. Next, block 346 directs the SAA to address the (next) highest peak of those that have just been sorted at block 344. Reference is made to the “next” highest peak because the peaks are considered in succession. On the first pass through the process, however, the highest peak found in the sort is the first peak addressed.

[0114] Next, block 348 causes the SAA 14 to use the position of the currently addressed peak in the reference spectrum to locate a corresponding peak in the test spectrum. This may involve looking for a peak in a window positioned at a corresponding position in the test spectrum. On finding such a peak, the maximum intensity value (max(y)) associated with that peak is found.

[0115] At block 350, the SAA 14 is directed to calculate a concentration value as a function of the max(y) value, using the following equation:

$Ct = \frac{\text{adjustedwidth} \times \max(y) \times \text{dssconcentration} \times \text{dssprotonratio}}{\text{Dssheight} \times \text{peakprotonratio}} \quad (17)$

[0116] Where: Ct is the concentration value for the peak

[0117] adjustedwidth is the width of the peak as determined from

[0118] the variable w calculated as shown in FIG. 9 and the Lorentzian width adjust value stored in the reference record

[0119] max(y) is the maximum y-value associated with the corresponding peak in the test spectrum

[0120] dssconcentration is the concentration of DSS in the sample (0.5 mM, for example)

[0121] dssprotonratio is the DSS proton ratio (9, for example)

[0122] Dssheight is the DSS height value a, calculated as shown in FIG. 9

[0123] peakprotonratio is the proton ratio of the peak, as indicated in the reference record.
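Equation 17 and its rearrangement for max(y), which is used later at block 358, can be sketched in Python as follows. The function names are hypothetical, and the numeric values in the example call are illustrative apart from the DSS concentration of 0.5 mM and the DSS proton ratio of 9, which are the example values given above.

```python
def peak_concentration(adjusted_width, max_y, dss_concentration,
                       dss_proton_ratio, dss_height, peak_proton_ratio):
    """Equation 17: concentration value Ct for one peak of the test spectrum."""
    return (adjusted_width * max_y * dss_concentration * dss_proton_ratio
            / (dss_height * peak_proton_ratio))


def expected_height(ct, adjusted_width, dss_concentration,
                    dss_proton_ratio, dss_height, peak_proton_ratio):
    """Equation 17 solved for max(y), given a concentration value Ct."""
    return (ct * dss_height * peak_proton_ratio
            / (adjusted_width * dss_concentration * dss_proton_ratio))


# Illustrative call: width, peak height, DSS height and peak proton ratio are made up.
print(peak_concentration(1.0, 600.0, 0.5, 9, 900.0, 3))
```

The result carries the units of `dss_concentration`, since every other factor in equation 17 appears as a ratio.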

[0124] At block 352, the SAA 14 is directed to determine whether the currently calculated concentration value is less than the previously calculated value. If so, then block 354 causes the SAA 14 to set a preliminary upper bound concentration value to the current concentration value. If at block 352 the currently calculated concentration value is not less than the previously calculated value, the preliminary upper bound concentration estimate value remains at its former value. The effect of blocks 352 and 354 is to cause the preliminary upper bound concentration estimate to be set to the lowest concentration value calculated for any of the peaks.

[0125] Once the preliminary value has been determined from the current pass, block 356 directs the SAA 14 to determine whether all peaks with quantification values of 1 have been considered. If so, the SAA 14 is directed to optional block 357 in FIG. 17. If not, the SAA 14 is directed to block 358, which causes the SAA to calculate the expected height of the next peak associated with the compound in the test spectrum. To do this, equation 17 above is solved for max(y) using the current preliminary concentration estimate, the Lorentzian width adjust value, and the peak proton ratio of the next highest peak from the list of sorted peaks. Then, block 359 in FIG. 17 causes the SAA 14 to determine whether the max(y) value so found is less than the noise level of the spectrum. (The noise level was calculated at block 250 in FIG. 9.) If not, then the next peak is worth considering and the SAA 14 is directed to resume processing at block 346 to address the next highest peak in the sorted list.

[0126] If the estimated height of the next highest peak found at block 358 is less than the noise level of the spectrum, the SAA 14 is directed to an optional block 357, which increases the amplitude of the preliminary concentration estimate value by the amplitude of the noise in the test spectrum to produce a true estimate of the upper bound concentration limit for the compound. This is useful where concentration values are very low.

[0127] Then, finally, block 355 directs the SAA 14 to associate the true upper bound concentration estimate with the reference record, such as by storing the upper bound concentration estimate value in a field (not shown) of the record, or in a field of a data structure maintained in the SAA 14 to create such associations.
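The loop of FIG. 17 (blocks 344 to 359) can be summarized by the following Python sketch. It is a simplification under stated assumptions: each peak is a dictionary with hypothetical keys, the constants of equation 17 are bundled into a single per-peak `gain` so that the concentration is `gain * test_max_y`, and the optional noise adjustment of block 357 is omitted.

```python
def upper_bound_concentration(peaks, noise_level):
    """Return the lowest per-peak concentration among the peaks worth considering.

    Each peak is a dict with keys:
      ref_height      height of the peak in the reference record (sort key)
      quantification  1 if the peak gives a reliable concentration estimate
      test_max_y      max(y) of the corresponding peak in the test spectrum
      gain            factor multiplying max(y) in equation 17 for this peak
    """
    usable = sorted((p for p in peaks if p["quantification"] == 1),
                    key=lambda p: p["ref_height"], reverse=True)
    upper_bound = None
    for index, peak in enumerate(usable):
        concentration = peak["gain"] * peak["test_max_y"]
        if upper_bound is None or concentration < upper_bound:
            upper_bound = concentration
        if index + 1 < len(usable):
            # Predict the height of the next peak at the current estimate.
            expected = upper_bound / usable[index + 1]["gain"]
            if expected < noise_level:
                break  # the next peak would be buried in noise
    return upper_bound
```

Taking the minimum over peaks reflects the idea that overlapping signals from other compounds can only raise a measured peak height, so the lowest per-peak value is the tightest bound available.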

Least Squares Fitting

[0128] Referring back to FIG. 10B, the process for identifying and quantifying compounds involves a block 334 which causes the SAA 14 to perform a least squares fitting algorithm using all of the derived reference records and the test spectrum to produce scaling values for each peak in each reference spectrum such that when all peaks from all reference spectra are summed they produce a composite spectrum that best matches the test spectrum.

[0129] Referring to FIG. 18, the least squares fitting routine includes a first block 360 which causes the SAA 14 to produce “signature” spectra comprised of (x,y) pairs that define a composite spectrum representative of the sum of all Lorentzians in a given derived reference record. A separate signature spectrum is produced for each derived reference record. Thus a separate (x,y) array is produced for each derived reference record.

[0130] Block 362 then provides each signature spectrum, the upper bound concentrations and the (x,y) array representing the test spectrum to a linear least squares fitting routine, which in this embodiment is LSSOL, licensed from Stanford University, California, U.S.A. This routine returns scaling factors for each peak in each applicable reference record, such that when the scaled Lorentzian models specified in all applicable reference records are summed together to make a composite spectrum, the composite spectrum has features matching features in the test spectrum produced from the sample. These scaling factors thus identify representative reference spectra from a set of reference spectra associated with detectable compounds and selected according to the measured condition of the sample.
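The fitting step can be illustrated with a small bounded linear least squares solver. The Python sketch below is a stand-in and not the licensed routine named above: it uses a simple projected coordinate descent, it returns one scaling factor per signature spectrum rather than per peak, and it treats the upper bound estimates as upper limits on the scaling factors. The function name and the fixed number of sweeps are assumptions.

```python
def bounded_least_squares(signatures, test_y, upper_bounds, sweeps=200):
    """Find factors s_j in [0, upper_bounds[j]] minimizing
    the sum over x of (test_y - sum_j s_j * signatures[j])^2.

    signatures is a list of y-arrays, one per derived reference record,
    all sampled at the same x positions as test_y.
    """
    factors = [0.0] * len(signatures)
    residual = list(test_y)  # test_y minus the current composite spectrum
    norms = [sum(v * v for v in sig) for sig in signatures]
    for _ in range(sweeps):
        for j, sig in enumerate(signatures):
            if norms[j] == 0.0:
                continue
            # Best unconstrained change of factor j with all others held fixed.
            delta = sum(r * v for r, v in zip(residual, sig)) / norms[j]
            new_value = min(max(factors[j] + delta, 0.0), upper_bounds[j])
            step = new_value - factors[j]
            if step != 0.0:
                residual = [r - step * v for r, v in zip(residual, sig)]
                factors[j] = new_value
    return factors
```

The lower limit of zero keeps the composite spectrum physically meaningful, since a compound cannot contribute a negative amount of signal.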

[0131] In this embodiment, an indication of compounds associated with reference spectra having peaks that, when scaled by the scaling factors, have a height greater than a threshold may be produced. This may involve producing a list of compounds, for example. Scaled peaks having a height less than the threshold may indicate that the presence of the associated compound in the sample is questionable, and therefore such a compound should not be listed as being present in the sample.
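The threshold test might be sketched as below. The record layout, the field names and the choice to list a compound when at least one of its scaled peaks exceeds the threshold are assumptions made for illustration; the description does not fix these details.

```python
def compounds_above_threshold(records, threshold):
    """Return names of compounds with at least one scaled peak above threshold.

    `records` is a hypothetical list of dicts, each holding a compound name,
    its reference peak heights, and the per-peak scaling factors from the fit.
    """
    present = []
    for record in records:
        scaled = [h * s for h, s in zip(record["peak_heights"],
                                        record["scaling_factors"])]
        if any(height > threshold for height in scaled):
            present.append(record["compound"])
    return present


# Made-up fit results: compound B scales down to a negligible height.
fitted_records = [
    {"compound": "compound A", "peak_heights": [10.0, 4.0], "scaling_factors": [0.8, 0.8]},
    {"compound": "compound B", "peak_heights": [6.0], "scaling_factors": [0.01]},
]
detected = compounds_above_threshold(fitted_records, threshold=0.5)
```

Here only compound A would be listed, since the single scaled peak of compound B falls below the threshold.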

[0132] Block 364 then causes the SAA 14 to employ these scaling factors in the following equation to quantify each compound by producing concentration values for each compound represented by a reference record:

Conc. = (DSSRatio * scalingFactor * cdb) / pxDSS

[0133] Where: Conc.: concentration of the given compound in the sample

[0134] DSSRatio: the DSSRatio entry for the given compound (see field 202 in FIG. 7A)

[0135] scalingFactor: the scaling factor of the highest peak in the given compound (from least squares fitting)

[0136] cdb: the concentration of the given database entry (see field 202 in FIG. 7A)

[0137] pxDSS: the pixel height of DSS in the spectrum (the value a as determined by the process shown in FIG. 9)
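A worked instance of the concentration equation, written as a small function. The function and parameter names mirror the symbols defined above, and the numeric inputs are invented solely to show the arithmetic; the guard against a non-positive DSS height is an added assumption.

```python
def concentration(dss_ratio, scaling_factor, cdb, px_dss):
    """Conc. = (DSSRatio * scalingFactor * cdb) / pxDSS.

    The result is in the same concentration units as `cdb`.
    """
    if px_dss <= 0:
        # Assumed guard: a missing or zero DSS peak makes the ratio meaningless.
        raise ValueError("pxDSS must be positive")
    return dss_ratio * scaling_factor * cdb / px_dss


# Made-up inputs: DSSRatio 2.0, scaling factor 0.5, database concentration 10.0,
# DSS pixel height 4.0, giving (2.0 * 0.5 * 10.0) / 4.0 = 2.5.
example_conc = concentration(2.0, 0.5, 10.0, 4.0)
```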

[0138] Block 366 then causes the SAA 14 to associate these concentration values with the compounds associated with the derived reference records.

[0139] Block 368 then causes the SAA 14 to produce a list or indication of compounds in the sample, along with their associated concentration values. This list may be printed and/or displayed on a monitor, for example. Concentration values may be expressed in moles, mmol/L, g/L or moles/mole, for example, and absolute quantities may be obtained by a simple equation converting concentration to absolute quantity values, in moles, for example.
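One possible form of the conversion from concentration to absolute quantity is shown below, assuming the concentration is expressed in mmol/L and the sample volume is known in millilitres. Both assumptions, and the function name, are illustrative only.

```python
def absolute_moles(conc_mmol_per_l, volume_ml):
    """Convert a concentration in mmol/L and a volume in mL to moles.

    mmol/L * mL = micromoles, so the product is divided by 1e6 to give moles.
    """
    return conc_mmol_per_l * volume_ml / 1.0e6


# Made-up example: 2.5 mmol/L in a 0.6 mL sample.
quantity = absolute_moles(2.5, 0.6)
```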

[0140] While specific embodiments of the invention have been described and illustrated, such embodiments should be considered illustrative of the invention only and not as limiting the invention as construed in accordance with the accompanying claims.

What is claimed is:
 1. A computer-implemented process for automatically identifying compounds in a sample mixture, the process comprising: receiving a representation of a measured condition of the sample mixture; using said representation of a measured condition of the sample mixture to select a set of reference spectra of compounds suspected to be contained in said sample mixture, from a library of reference spectra; receiving a representation of a test spectrum having peaks associated with compounds therein, said test spectrum being produced from the sample mixture under said measured condition; and combining reference spectra from said set of reference spectra to produce a matching composite spectrum having peaks associated with at least some of said suspected compounds, that match peaks in said test spectrum, the compounds associated with the reference spectra that combine to produce the matching spectrum being indicative of the compounds in the sample mixture.
 2. The process of claim 1 further comprising identifying compounds associated with said representative reference spectra.
 3. The process of claim 2 wherein identifying compounds comprises identifying quantities of said compounds.
 4. The process of claim 2 wherein identifying compounds comprises identifying concentrations of said compounds.
 5. The process of claim 1 further comprising identifying a peak associated with a calibration compound, in said test spectrum.
 6. The process of claim 5 wherein identifying a peak comprises identifying a peak meeting a set of criteria that associate said peak with said calibration compound.
 7. The process of claim 5 wherein identifying comprises producing Lorentzian line shape parameters to represent said peak.
 8. The process of claim 1 further comprising receiving a measured condition value representing a measured condition of the sample.
 9. The process of claim 8 further comprising producing said measured condition value.
 10. The process of claim 9 wherein producing said measured condition value comprises measuring pH of the sample.
 11. The process of claim 9 wherein producing said condition value comprises producing said condition value from said test spectrum.
 12. The process of claim 11 wherein producing said condition value comprises identifying in said test spectrum a peak position associated with a condition reference compound.
 13. The process of claim 12 wherein identifying a peak position comprises identifying a peak meeting a set of criteria that associate said peak with said condition reference compound.
 14. The process of claim 12 wherein producing said condition value comprises producing said condition value as a function of said peak position and parameters of a sample solvent.
 15. The process of claim 9 wherein producing said condition value comprises determining a pH value for the sample, from said test spectrum.
 16. The process of claim 15 wherein determining a measured pH value comprises determining from said test spectrum, the location of a peak associated with a pH reference compound, in relation to a peak associated with a calibration reference compound.
 17. The process of claim 16 wherein producing said condition value comprises producing said condition value as a function of said peak location and parameters of a sample solvent.
 18. The process of claim 8 further comprising adjusting a set of base reference spectra according to said condition value, to produce said set of reference spectra.
 19. The process of claim 18 wherein adjusting said set of base reference spectra comprises adjusting parameters of said base reference spectra according to a pH of the sample.
 20. The process of claim 8 further comprising producing a derived reference spectrum in response to said measured condition value and a reference spectrum.
 21. The process of claim 20 wherein producing the derived reference spectrum comprises identifying a reference spectrum that is associated with a condition value nearest to said measured condition value.
 22. The process of claim 21 wherein producing the derived reference spectrum comprises deriving a value from at least one reference spectrum that is associated with a condition value nearest to said measured condition value.
 23. The process of claim 22 wherein producing the derived reference spectrum comprises performing mathematical operations on parameters of a reference spectrum to produce new parameters for use as parameters of said derived reference spectrum.
 24. The process of claim 20 further comprising identifying in said test spectrum a peak associated with a calibration compound and producing Lorentzian line shape parameters, including a line width parameter, to represent said peak.
 25. The process of claim 24 wherein said derived reference spectrum is represented by at least one set of Lorentzian line shape parameters including a line width parameter, said process further comprising calibrating said line width parameter associated with said derived reference spectrum relative to a line width parameter associated with said calibration compound.
 26. The process of claim 25 wherein identifying representative reference spectra comprises adjusting a parameter of said at least one derived reference spectrum until said at least one derived reference spectrum best aligns with said test spectrum.
 27. The process of claim 25 wherein identifying representative reference spectra comprises producing a cluster position indicator for a derived reference spectrum, which causes the positions of peaks in said derived reference spectrum to match corresponding peaks of said test spectrum.
 28. The process of claim 25 wherein identifying representative reference spectra comprises producing an upper bound concentration estimate of a quantity of a compound associated with the derived reference spectrum.
 29. The process of claim 28 wherein producing an upper bound concentration estimate comprises selecting as said upper bound concentration estimate, a lowest concentration value selected from a plurality of concentration values computed for respective peaks in the test spectrum.
 30. The process of claim 29 wherein producing said upper bound concentration estimate comprises finding the height of a peak in the test spectrum that corresponds to a peak in the reference spectrum.
 31. The process of claim 30 wherein producing said upper bound concentration estimate comprises determining a concentration value for a peak as a function of said height of said peak.
 32. The process of claim 31 wherein producing said upper bound concentration estimate comprises predicting whether said height of a peak in the test spectrum is greater than a threshold level and not determining a concentration for said peak when said height is less than said threshold level.
 33. The process of claim 25 further comprising adjusting the relative positions of peaks associated with one of said compounds according to pre-defined criteria.
 34. The process of claim 25 wherein identifying representative reference spectra comprises determining scaling factors for peaks in a plurality of reference spectra such that the sum of said peaks scaled by said scaling factors optimally matches said test spectrum.
 35. The process of claim 34 further comprising determining concentrations of compounds associated with said reference spectra as a function of said scaling factors.
 36. The process of claim 35 further comprising producing an indication of compounds associated with reference spectra having peaks that when scaled by said scaling factors have a height greater than a threshold.
 37. The process of claim 35 further comprising outputting a value representing at least one of said concentrations.
 38. The process of claim 1 further comprising receiving said test spectrum from a spectrum measurement device.
 39. The process of claim 1 further comprising doping the sample with a condition indicator.
 40. The process of claim 39 wherein doping comprises doping the sample with a pH indicator.
 41. The process of claim 40 further comprising doping the sample with a chemical shift reference compound.
 42. The process of claim 41 further comprising employing Nuclear Magnetic Resonance (NMR) to produce free induction decay data operable to be transformed into an NMR spectrum operable to be used as the test spectrum.
 43. The process of claim 1 further comprising receiving Nuclear Magnetic Resonance (NMR) free induction decay (FID) data and processing said FID data to produce a representation of a measured spectrum having well defined Lorentzian lines, a flat baseline and peaks that have positive well-defined areas.
 44. The process of claim 43 further comprising producing said test spectrum from said measured spectrum.
 45. The process of claim 44 wherein producing said test spectrum comprises producing a conditioned spectrum.
 46. The process of claim 45 wherein producing said conditioned spectrum comprises producing a baseline corrected spectrum from said measured spectrum.
 47. A computer-readable medium for providing computer readable instructions for directing a processor circuit to identify compounds in a sample, the instructions comprising: a set of codes operable to cause the processor circuit to receive a representation of a measured condition of the sample mixture; a set of codes operable to cause the processor circuit to use said representation of a measured condition of the sample mixture to select a set of reference spectra of compounds suspected to be contained in said sample mixture, from a library of reference spectra; a set of codes operable to cause the processor circuit to receive a representation of a test spectrum, produced from the sample mixture under said measured conditions; and a set of codes operable to cause the processor circuit to combine reference spectra from said set of reference spectra to produce a matching composite spectrum having peaks representing at least some of said suspected compounds, that match peaks in said test spectrum, the compounds associated with the reference spectra that combine to produce the matching spectrum being the compound in the sample mixture.
 48. A signal embodied in a carrier wave, said signal comprising: a code segment operable to cause a processor circuit to receive a representation of a measured condition of the sample mixture; a code segment operable to cause a processor circuit to use said representation of a measured condition of the sample mixture to select a set of reference spectra of compounds suspected to be contained in said sample mixture, from a library of reference spectra; a code segment operable to cause a processor circuit to receive a representation of a test spectrum, produced from the sample mixture under said measured conditions; and a code segment operable to cause a processor circuit to combine reference spectra from said set of reference spectra to produce a matching composite spectrum having peaks representing at least some of said suspected compounds, that match peaks in said test spectrum, the compounds associated with the reference spectra that combine to produce the matching spectrum being the compound in the sample mixture.
 49. An apparatus for identifying compounds in a sample, the apparatus comprising a processor circuit programmed to: receive a representation of a measured condition of the sample mixture; use said representation of a measured condition of the sample mixture to select a set of reference spectra of compounds suspected to be contained in said sample mixture, from a library of reference spectra; receive a representation of a test spectrum, produced from the sample mixture under said measured conditions; and combine reference spectra from said set of reference spectra to produce a matching composite spectrum having peaks representing at least some of said suspected compounds, that match peaks in said test spectrum, the compounds associated with the reference spectra that combine to produce the matching spectrum being the compound in the sample mixture.
 50. An apparatus for identifying compounds in a sample, the apparatus comprising: means for receiving a representation of a measured condition of the sample mixture; means for using said representation of a measured condition of the sample mixture to select a set of reference spectra of compounds suspected to be contained in said sample mixture, from a library of reference spectra; means for receiving a representation of a test spectrum, produced from the sample mixture under said measured conditions; and means for combining reference spectra from said set of reference spectra to produce a matching composite spectrum having peaks representing at least some of said suspected compounds, that match peaks in said test spectrum, the compounds associated with the reference spectra that combine to produce the matching spectrum being the compound in the sample mixture.
 51. A computer-implemented process for producing a trace file for use in spectrum analysis, the method comprising: performing a Fourier Transform on Free Induction Decay (FID) data to produce an initial spectrum; filtering a selected region of said initial spectrum to produce a filtered spectrum; and phasing said filtered spectrum to produce a measured spectrum having a flat baseline and well defined positive peaks.
 52. The method of claim 51 wherein filtering comprises applying a notch filter to said selected region to suppress a peak associated with a contaminant in said contaminant region.
 53. The method of claim 52 wherein applying a notch filter comprises producing an adjusted set of notch filter parameters and applying a notch filter employing said adjusted set of notch filter parameters to said selected region.
 54. The method of claim 53 wherein applying a notch filter comprises iteratively adjusting said set of notch filter parameters and applying said adjusted notch filter parameters to a notch filter and applying said notch filter to said selected region until a sum of the absolute values of areas defined by peaks above and below a baseline of said initial spectrum is minimized.
 55. The method of claim 51 wherein phasing said adjusted spectrum comprises adjusting real and imaginary components of said filtered spectrum until said filtered spectrum has all positive, well defined peaks.
 56. The method of claim 51 wherein performing a Fourier Transform comprises performing a weighted Fourier Transform with weights that provide for enhancement of said initial spectrum.
 57. The method of claim 56 wherein performing a weighted Fourier Transform comprises employing weights that perform a line broadening function to said initial spectrum.
 58. The method of claim 51 further comprising defining the size of a window on said initial spectrum.
 59. The method of claim 58 wherein defining the size of a window comprises scaling said initial spectrum.
 60. The method of claim 51 further comprising correcting said initial spectrum for drift effects.
 61. The method of claim 51 further comprising performing baseline correction on said measured spectrum.
 62. A computer readable medium for providing codes operable to direct a processor circuit to produce a trace file for use in spectrum analysis, the computer readable medium comprising: codes for automatically causing the processor circuit to perform a Fourier Transform on Free Induction Decay (FID) data to produce an initial spectrum; codes for automatically causing the processor circuit to filter a selected region of said initial spectrum to produce a filtered spectrum; and codes for automatically causing the processor circuit to phase said filtered spectrum to produce a measured spectrum having a flat baseline and well defined positive peaks.
 63. An apparatus for producing a trace file for use in spectrum analysis, the apparatus comprising: means for automatically performing a Fourier Transform on Free Induction Decay (FID) data to produce an initial spectrum; means for automatically filtering a selected region of said initial spectrum to produce a filtered spectrum; and means for automatically phasing said filtered spectrum to produce a measured spectrum having a flat baseline and well defined positive peaks.
 64. A signal for causing a processor circuit to produce a trace file for use in spectrum analysis, the signal including: a first segment comprising codes for automatically causing said processor circuit to perform a Fourier Transform on Free Induction Decay (FID) data to produce an initial spectrum; a second segment comprising codes for automatically causing the processor circuit to filter a selected region of said initial spectrum to produce a filtered spectrum; and a third segment comprising codes for automatically causing the processor circuit to phase said filtered spectrum to produce a measured spectrum having a flat baseline and well defined positive peaks.
 65. A computer-implemented process for producing a representation of a spectrum for a hypothetical solution containing a compound, for use in determining the composition of a test sample, the process comprising: producing a position value for at least one peak of a reference spectrum as a function of a measured condition of the test sample, and a property of said at least one peak in a base reference spectrum.
 66. The process of claim 65 wherein producing a position value comprises interpolating said position value from position values associated with base reference spectra associated with a condition nearest to said measured condition.
 67. The process of claim 65 wherein producing a position value comprises calculating said position value as a function of pH of said sample.
 68. The process of claim 65 wherein producing a position value comprises producing said position value by addressing a lookup table of position values with a measured condition value representing said condition of said sample.
 69. The process of claim 65 further comprising accessing a pre-defined record specifying peaks in a reference spectrum and adjusting a position value in said record, said position value being said position value of said at least one peak.
 70. The process of claim 69 wherein adjusting comprises locating a condition value dependent function in said pre-defined record, producing said position value from said function and associating said position value with said pre-defined record.
 71. The process of claim 70 wherein associating comprises storing said position value in said pre-defined record.
 72. The process of claim 69 wherein adjusting comprises locating in said pre-defined record a link to a lookup table specifying peak positions for various conditions and retrieving said position value from said lookup table and associating said position value with said pre-defined record.
 73. The process of claim 72 wherein associating comprises storing said position value in said pre-defined record.
 74. A computer-readable medium for providing computer readable instructions for causing a processor circuit to produce a representation of a spectrum for a hypothetical solution containing a compound, for use in determining the composition of a test sample, the instructions comprising: a set of codes for directing the processor circuit to produce a position value for at least one peak of a reference spectrum as a function of a measured condition of the test sample, and a property of said at least one peak in a base reference spectrum.
 75. A signal operable to cause a processor circuit to produce a representation of a spectrum for a hypothetical solution containing a compound, for use in determining the composition of a test sample, the signal comprising a signal segment comprising codes operable to cause the processor circuit to produce a position value for at least one peak of a reference spectrum as a function of a measured condition of the test sample, and a property of said at least one peak in a base reference spectrum.
 76. An apparatus for producing a representation of a spectrum for a hypothetical solution containing a compound, for use in determining the composition of a test sample, the apparatus comprising a processor circuit programmed to produce a position value for at least one peak of a reference spectrum as a function of a measured condition of the test sample, and a property of said at least one peak in a base reference spectrum.
 77. An apparatus for producing a representation of a spectrum for a hypothetical solution containing a compound, for use in determining the composition of a test sample, the apparatus comprising: means for receiving a measured condition value representing a condition of the test sample; means for receiving a representation of a position of at least one peak in a base reference spectrum; and means for producing a position value for at least one peak of a derived reference spectrum as a function of said measured condition value of the test sample, and the position of said at least one peak in a base reference spectrum.